Quantum information can be negative
Given an unknown quantum state distributed over two systems, we determine
how much quantum communication is needed to transfer the full state to one
system. This communication measures the partial information one system needs
conditioned on its prior information. It turns out to be given by an
extremely simple formula, the conditional entropy. In the classical case,
partial information must always be positive, but we find that in the quantum
world this physical quantity can be negative. If the partial information is
positive, its sender needs to communicate this number of quantum bits to the
receiver; if it is negative, the sender and receiver instead gain the
corresponding potential for future quantum communication. We introduce a
primitive, quantum state merging, which optimally transfers partial
information. We show how it enables a systematic understanding of quantum
network theory, and discuss several important applications including
distributed compression, multiple access channels and multipartite assisted
entanglement distillation (localizable entanglement). Negative channel
capacities also receive a natural interpretation.

'Ignorance is strength' is one of the three cynical mottos of Big Brother in
George Orwell's 1984. Most of us would naturally incline to the opposite
view, trying continually to increase our knowledge on just about everything.
But regardless of preferences, we are thus confronted by two questions: how
much is there to know? And, how large is our ignorance in a given situation?

The reader will observe that the formulation of these questions addresses
the quantity of information, not its content, and this is simply because the
latter is hard to assess and to compare. The former approach to classical
information was pioneered by Claude Shannon [1], who provided the tools and
concepts to scientifically answer the first of our two questions: the amount
of information originating from a source is the memory required to
faithfully represent its output. For the case of a statistical source, on
which we will concentrate throughout, this amount is given by its entropy.

To approach the second question, let us introduce a two-player game. One
participant (Bob) has some incomplete prior information Y, the other (Alice)
holds some missing information X: we think of X and Y as random variables,
and Bob has prior information due to possible correlations between X and Y.
If Bob wants to learn X, how much additional information does Alice need to
send him? This is one of the key problems of classical information theory,
since it describes a ubiquitous scenario in information networks. It was
solved by Slepian and Wolf [2], who proved that the amount of information
that Bob needs is given by a quantity called the conditional entropy. It
measures the partial information that Alice must send to Bob so that he
gains full knowledge of X given his previous knowledge from Y, and it is
just the difference between the entropy of (X, Y) taken together (the total
information) and the entropy of Y (the prior information). Of course, this
partial information is always a positive quantity. Classically, there would
be no meaning to negative information.
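
The difference described above, H(X|Y) = H(X,Y) - H(Y), can be checked
numerically. The following is a minimal Python sketch; the function names
and the two toy joint distributions are our own illustrative assumptions
(they do not come from the cited works), and entropies are Shannon entropies
in bits.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def conditional_entropy(joint):
    """H(X|Y) = H(X,Y) - H(Y) for a joint distribution {(x, y): probability}."""
    marginal_y = {}
    for (_, y), p in joint.items():
        marginal_y[y] = marginal_y.get(y, 0.0) + p
    return entropy(joint.values()) - entropy(marginal_y.values())


# Hypothetical toy sources, chosen only for illustration.
perfectly_correlated = {(0, 0): 0.5, (1, 1): 0.5}   # Bob's Y already determines X
independent = {(x, y): 0.25 for x in (0, 1) for y in (0, 1)}  # Y tells Bob nothing

print(conditional_entropy(perfectly_correlated))  # no bits need to be sent
print(conditional_entropy(independent))           # one full bit must be sent
```

In both toy cases the result is non-negative, as it must be for any
classical joint distribution.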

In the quantum world, the first of our two questions, how to quantify
quantum information, was answered by Schumacher [3], who showed that the
minimum number of quantum bits required to compress quantum information is
given by the quantum (von Neumann) entropy. To answer the second question,
let us now consider the quantum version of the two-party scenario above:
Alice and Bob each possess a system in some unknown quantum state with the
total density operator being $\rho_{AB}$ and each party having states with
density operators $\rho_A$ and $\rho_B$ respectively. The interesting case
is where Bob is correlated with Alice, so that he has some prior information
about her state. We now ask how much additional quantum information Alice
needs to send him, so that he has the full state (with density operator
$\rho_{AB}$). Since we want to quantify the quantum partial information, we
are interested in the minimum amount of quantum communication to do this,
allowing unlimited classical communication - the latter type of information
being far easier to transmit than the former as it can be sent over a
telephone while the former is extremely delicate and must be sent using a
special quantum channel.

Since we are interested in informational quantities, we go to the limit of
many copies of state $\rho_{AB}$ and vanishing but non-zero errors in the
protocol. We find here that the amount of partial quantum information that
Alice needs to send Bob is given by the quantum conditional entropy, which
is exactly the same quantity as in the classical case but with the Shannon
entropy changed to the von Neumann entropy:

$$S(A|B) = S(AB) - S(B), \tag{1}$$

where $S(B)$ is the entropy of Bob's state $\rho_B$ and $S(AB)$ is the
entropy of the joint state $\rho_{AB}$. For quantum states, the conditional
entropy can be negative [4-6], and thus it is rather surprising that this
quantity has a physical interpretation in terms of how much quantum
communication is needed to gain complete quantum information, i.e.,
possession of a system in the total state $\rho_{AB}$.

However, in the above scenario, the negative conditional entropy can be
clearly interpreted. We find that when $S(A|B)$ is negative, Bob can obtain
the full state using only classical communication, and additionally, Alice
and Bob will have the potential to transfer additional quantum information
in the future at no additional cost to them. Namely, they end up sharing
$-S(A|B)$ Einstein-Podolsky-Rosen (EPR) pairs [7], i.e. pure maximally
entangled states $\frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$, which can be
used to teleport [8] quantum states between the two parties using only
classical communication. Negative partial information thus also gives Bob
the potential to receive future quantum information for free. The
conditional entropy plays the same role in quantum information theory as it
does in the classical theory, except that here, the quantum conditional
entropy can be negative in an operationally meaningful way. One could say
that the ignorance of Bob, the conditional entropy, if negative, precisely
cancels the amount by which he knows too much [9]; the latter being just the
potential future communication gained. This solves the well known puzzle of
how to interpret the quantum conditional entropy which has persisted despite
interesting attempts to understand it [6]. Since there are no conditional
probabilities for quantum states, $S(A|B)$ is not an entropy as in the
classical case. But by going back to the definition of information in terms
of storage space needed to hold a message or state, one can make operational
sense of this quantity.
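
A short worked calculation illustrates why EPR pairs are a natural unit for
negative partial information. It is only a sketch: it assumes equation (1)
and the additivity of the von Neumann entropy over independent (product)
systems, and the primed labels $A'$, $B'$ for one extra EPR pair are our
notation for this illustration.

```latex
\begin{align*}
S(A'|B') &= S(A'B') - S(B') = 0 - 1 = -1
  && \text{(an EPR pair is pure overall and maximally mixed on } B'\text{)} \\
S(AA'|BB') &= \bigl[S(AB) + S(A'B')\bigr] - \bigl[S(B) + S(B')\bigr]
            = S(A|B) - 1
  && \text{(entropies of independent systems add)}
\end{align*}
```

Each shared EPR pair thus carries one unit of negative conditional entropy,
which is consistent with counting $-S(A|B)$ EPR pairs when $S(A|B) < 0$.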

Let us now turn to the protocol which allows Alice to transfer her state to
Bob's site in the above scenario (we henceforth adopt the common usage of
referring directly to manipulations on "states" - meaning a manipulation on
a physical system in some quantum state). We call this quantum state
merging, since Alice is effectively merging her state with that of Bob. Let
us recall that in quantum information theory, faithful state transmission
means that while the state merging protocol may depend on the density
operator of the source, it must succeed with high probability for any pure
state sent. An equivalent and elegant way of expressing this criterion is to
imagine that $\rho_{AB}$ is part of a pure state $|\psi\rangle_{ABR}$, which
includes a reference system R. Alice's goal is to transfer the state
$\rho_A$ to Bob, and we demand that after the protocol, the total state
still has high fidelity with $|\psi\rangle_{ABR}$ (meaning they are nearly
identical); see Figure 1 which includes a high-level description of the
protocol. The essential element of state merging is that $\rho_R$ must be
unchanged, and Alice must decouple her state from R. This also means
(seemingly paradoxically) that as far as any outside party is concerned,
neither the classical nor quantum communication is coupled with the merged
state.

Figure 1: Diagrammatic representation of the process of state merging.
Initially the state $|\psi\rangle$ is shared between the three systems
R(eference), A(lice) and B(ob). After the communication Alice's system is in
a pure state, while Bob holds not only his but also her initial share. Note
that the reference's state $\rho_R$ has not changed, as indicated by the
curve separating R from AB. The protocol for state merging is as follows:
Let Alice and Bob have a large number n of copies of the state $\rho_{AB}$.
To begin, we note that we only need to describe the protocol for negative
$S(A|B)$, as otherwise Alice and Bob can share $nS(A|B)$ EPR pairs (by
sending this number of quantum bits) and create a state
$|\psi\rangle_{AA'BB'R}$ with $S(AA'|BB') < 0$. This is because adding an
EPR pair reduces the conditional entropy by one unit. However, $S(A|B) < 0$
is equivalently expressed as $S(B) > S(AB) = S(R)$, and it is known [10-12]
that measurement in a uniformly random basis on Alice's n systems projects
Bob and R into a state $|\varphi\rangle_{BR}$ whose reduction to R is very
close to $\rho_R$. But this means that Bob can, by a local operation,
transform $|\varphi\rangle_{BR}$ to $|\psi\rangle_{ABR}$. Finally, by
coarse-graining the random measurement, Alice essentially projects onto a
good quantum code [10-12] of rate $-S(A|B)$; this still results in Bob
obtaining the full state $\rho_{AB}$, but now, just under $-nS(A|B)$ EPR
pairs are also created. These codes can also be obtained by an alternative
construction [13].

Let us now consider three instructive and simple examples:

1. Alice has a completely unknown state which we can represent as the
   maximally mixed density matrix
   $\rho_A = \frac{1}{2}(|0\rangle\langle 0|_A + |1\rangle\langle 1|_A)$,
   and Bob has no state (or a known state $|0\rangle_B$). In this case,
   $S(A|B) = 1$ and Alice must send one qubit down the quantum channel to
   transfer her state to Bob. She could also send half of an EPR state to
   Bob, and use quantum teleportation [8] to transfer her state.

2. The classically correlated state
   $\rho_{AB} = \frac{1}{2}(|00\rangle\langle 00|_{AB} + |11\rangle\langle 11|_{AB})$.
   We imagine this state as being part of a pure state with the reference
   system R,
   $|\psi\rangle_{ABR} = \frac{1}{\sqrt{2}}(|0\rangle_A|0\rangle_B|0\rangle_R + |1\rangle_A|1\rangle_B|1\rangle_R)$.
   In this case, $S(A|B) = 0$, and thus no quantum information needs to be
   sent. Indeed, Alice can measure her state in the basis
   $|0\rangle \pm |1\rangle$, and inform Bob of the result. Depending on the
   outcome of the measurement, Bob and R will share one of two states
   $|\phi^{\pm}\rangle_{BR} = \frac{1}{\sqrt{2}}(|0\rangle_B|0\rangle_R \pm |1\rangle_B|1\rangle_R)$,
   and by a local operation, Bob can always transform the state to
   $|\psi\rangle_{A'BR} = \frac{1}{\sqrt{2}}(|0\rangle_{A'}|0\rangle_B|0\rangle_R + |1\rangle_{A'}|1\rangle_B|1\rangle_R)$
   with A' being an ancilla at Bob's site. Alice has thus managed to send
   her state to Bob, while fully preserving their entanglement with R.

3. For the state
   $|\phi^{+}\rangle_{AB} = \frac{1}{\sqrt{2}}(|0\rangle_A|0\rangle_B + |1\rangle_A|1\rangle_B)$,
   $S(A|B) = -1$, and Alice and Bob can keep this shared EPR pair to allow
   future transmission of quantum information, while Bob creates the EPR
   pair $|\phi^{+}\rangle_{A'B}$ locally. That is, transferring a pure state
   is trivial since the pure state is known and can be created locally.
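
The values of S(A|B) quoted in the three examples can be reproduced
numerically. The sketch below is our own illustration, not part of the
protocol: it uses only the Python standard library, handles real symmetric
density matrices only, and obtains eigenvalues with a simple Jacobi
rotation routine that we assume is adequate for such small matrices. All
function names are hypothetical.

```python
import math


def eigenvalues_symmetric(matrix, sweeps=50):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-24:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle (|theta| <= pi/4) that zeroes entry (p, q).
                y, x = 2.0 * a[p][q], a[q][q] - a[p][p]
                if x < 0:
                    x, y = -x, -y
                theta = 0.5 * math.atan2(y, x)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return [a[i][i] for i in range(n)]


def von_neumann_entropy(rho):
    """S(rho) in bits; eigenvalues below 1e-12 are treated as zero."""
    return -sum(x * math.log2(x) for x in eigenvalues_symmetric(rho) if x > 1e-12)


def reduced_state_b(rho_ab, dim_a, dim_b):
    """Partial trace over A; the basis index of |a>|b> is a * dim_b + b."""
    return [[sum(rho_ab[a * dim_b + b][a * dim_b + b2] for a in range(dim_a))
             for b2 in range(dim_b)] for b in range(dim_b)]


def conditional_entropy(rho_ab, dim_a=2, dim_b=2):
    """S(A|B) = S(AB) - S(B), following equation (1)."""
    rho_b = reduced_state_b(rho_ab, dim_a, dim_b)
    return von_neumann_entropy(rho_ab) - von_neumann_entropy(rho_b)


def diag(values):
    """Diagonal matrix with the given entries."""
    return [[v if i == j else 0.0 for j in range(len(values))]
            for i, v in enumerate(values)]


# Basis order |00>, |01>, |10>, |11> (Alice's qubit first).
example_1 = diag([0.5, 0.0, 0.5, 0.0])   # maximally mixed A, Bob holds |0>
example_2 = diag([0.5, 0.0, 0.0, 0.5])   # classically correlated state
example_3 = [[0.5, 0.0, 0.0, 0.5],       # EPR pair (|00> + |11>)/sqrt(2)
             [0.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 0.0],
             [0.5, 0.0, 0.0, 0.5]]

for rho in (example_1, example_2, example_3):
    # Values should be close to 1, 0 and -1 respectively.
    print(round(conditional_entropy(rho), 6))
```

The sign change between the second and third state comes entirely from the
off-diagonal terms of the EPR pair, which make the joint state pure while
leaving Bob's reduced state maximally mixed.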



Let us now make a couple of observations about state merging. First, the
amount of classical communication that is required is given by the number of
quantum codes in Alice's projection: the quantum mutual information
$I(A:R) = S(A) + S(R) - S(AR)$ between Alice and the reference R. Secondly,
the measurement of Alice makes her state completely product with R, thus
reinforcing the interpretation of quantum mutual information as the minimum
entropy production of any local decorrelating process [14,15]. This same
quantity is also equal to the amount of irreversibility of a cyclic process:
Bob initially has a state, then gives Alice her share (communicating
$S(A)$), which is finally merged back to him (communicating $S(A|B)$). The
total quantum communication of this cycle is $I(A:R)$ quantum bits.

Because state merging is such a basic primitive, it allows us to solve a
number of other problems in quantum information theory fairly easily. We now
sketch four particularly striking applications.

Distributed quantum compression: for a single party, a source emitting
states with density matrix $\rho_A$ can be compressed at a rate given by the
entropy $S(A)$ of the source by performing quantum data compression [3]. Let
us now consider the distributed scenario - we imagine that the source emits
states with density matrix $\rho_{A_1 A_2 \ldots A_m}$, and distributes it
over m parties. The parties wish to compress their shares as much as
possible so that the full state can be reconstructed by a single decoder.
Until now the general solution of this problem has appeared intractable
[16], but it becomes very simple once we allow classical side information
for free, and use state merging.

Remarkably, the parties can compress the state at the total rate
$S(A_1 A_2 \ldots A_m)$ - the Schumacher limit [3] for collective
compression - even though they must operate separately. This is analogous to
the classical result, the Slepian-Wolf theorem [2]. We describe the quantum
solution for two parties and depict the rate region in Figure 2.
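
For two parties, the rate region can be written down directly from the
entropies. The following Python sketch is a hedged illustration: it assumes
the two-party conditions R_A >= S(A|B), R_B >= S(B|A) and
R_A + R_B >= S(AB) (the two-party special case of the subset condition in
the caption of Figure 2, with the boundary included), and it takes the
entropies as plain numbers. The function names and the sample entropies are
hypothetical.

```python
def in_two_party_rate_region(r_a, r_b, s_a, s_b, s_ab):
    """True if (r_a, r_b) satisfies R_A >= S(A|B), R_B >= S(B|A), R_A + R_B >= S(AB)."""
    s_a_given_b = s_ab - s_b   # S(A|B) = S(AB) - S(B)
    s_b_given_a = s_ab - s_a   # S(B|A) = S(AB) - S(A)
    return r_a >= s_a_given_b and r_b >= s_b_given_a and r_a + r_b >= s_ab


def corner_points(s_a, s_b, s_ab):
    """Corner points (R_A, R_B) between which time-sharing interpolates."""
    return [(s_ab - s_b, s_b), (s_a, s_ab - s_a)]


# Hypothetical source entropies in bits, chosen only for illustration.
S_A, S_B, S_AB = 1.0, 1.0, 1.5

print(corner_points(S_A, S_B, S_AB))                       # [(0.5, 1.0), (1.0, 0.5)]
print(in_two_party_rate_region(0.5, 1.0, S_A, S_B, S_AB))  # True: a corner point
print(in_two_party_rate_region(0.5, 0.5, S_A, S_B, S_AB))  # False: sum rate too low
```

At the first corner point one party compresses at S(B) while the other
over-compresses at S(A|B), which is the merging step described in the text;
a negative S(A|B) would simply move that corner to a negative R_A.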



Noiseless coding with side information: related to distributed compression
is the case where only Alice's state needs to arrive at the decoder, while
Bob can send part of his state to the decoder in order to help Alice lower
her rate. The classical case of this problem was introduced by Wyner [17].
For the quantum case, we demand that the full state $\rho_{AB}$ be preserved
in the protocol, but do not place any restriction on what part of Bob's
state may be at the decoder and what part can remain with him. For one-way
protocols, we find using state merging that if $\rho_A$ and $\rho_B$ are
encoded at rates $R_A$ and $R_B$ respectively, then the decoder can recover
$\rho_A$ if and only if $R_A \geq S(A|U)$ and
$R_B \geq E_p(AU:R) - S(A|U)$, with R being the purifying reference system,
U being a system with its state produced by some quantum channel on
$\rho_B$, and $E_p(AU:R) = \min_\Lambda S(A\Lambda(U))$ being the
entanglement of purification [18]. The minimum is taken over all channels
$\Lambda$ acting on U.

Figure 2: The rate region for distributed compression by two parties with
individual rates $R_A$ and $R_B$. The total rate $R_{AB}$ is bounded by
$S(AB)$. The top left diagram shows the rate region of a source with
positive conditional entropies; the top right and bottom left diagrams show
the purely quantum case of sources where $S(B|A) < 0$ or $S(A|B) < 0$. It is
even possible that both $S(B|A)$ and $S(A|B)$ are negative, as shown in the
bottom right diagram, but observe that the rate-sum $S(AB)$ has to be
positive. If one party compresses at a rate $S(B)$, then the other party can
over-compress at a rate $S(A|B)$, by merging her state with the state which
will end up with the decoder. Time-sharing gives the full rate region, since
the bounds evidently cannot be improved. Analogously, for m parties $A_i$
and all subsets $T \subset \{1, 2, \ldots, m\}$ holding a combined state
with entropy $S(T)$, the rate sums $R_T = \sum_{i \in T} R_{A_i}$ have to
obey $R_T \geq S(T|\bar{T})$ with
$\bar{T} = \{1, 2, \ldots, m\} \setminus T$ the complement of set T.

Quantum multiple access channel: in addition to the central questions of
information theory we asked earlier, how much is there to know, and how
great is our ignorance, information theory also concerns itself with
communication rates. In the quantum world, the rate at which quantum
information can be sent down a noisy channel is related to the coherent
information $I(A\rangle B)$ which was previously defined as [19]
$\max\{S(B) - S(AB), 0\}$. This quantity is the quantum counterpart of
Shannon's mutual information; when maximized over input states, it gives the
rate at which quantum information can be sent from Alice to Bob via a noisy
quantum channel [10-12]. As with the classical conditional entropy,
Shannon's classical mutual information [1] is always positive, and indeed it
makes no sense to have classical channels with negative capacity. However,
the relationship between the coherent information and the quantum channel
capacity contained a puzzle. As the latter was thought to be meaningful as a
positive quantity, the former was defined as the maximum of 0 and
$S(B) - S(AB)$ since it could be negative.

We will see that negative values do make sense, and thus propose that
$I(A\rangle B)$ should not be defined as above, but rather as
$I(A\rangle B) = -S(A|B)$. It turns out that negative capacity, impossible
in classical information theory, has its interpretation in a situation with
two senders.
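
The difference between the earlier, clipped definition and the proposed one
can be seen on the three example states discussed above. This is a small
Python sketch under our own naming; it evaluates the two expressions on
given entropies of a bipartite state and is not a channel capacity
computation.

```python
def coherent_information_clipped(s_b, s_ab):
    """Earlier definition: max{S(B) - S(AB), 0}."""
    return max(s_b - s_ab, 0.0)


def coherent_information(s_b, s_ab):
    """Proposed definition: -S(A|B) = S(B) - S(AB), allowed to be negative."""
    return s_b - s_ab


# (S(B), S(AB)) in bits for the three example states discussed earlier.
examples = {
    "maximally mixed A, pure B": (0.0, 1.0),
    "classically correlated": (1.0, 1.0),
    "EPR pair": (1.0, 0.0),
}

for name, (s_b, s_ab) in examples.items():
    print(name,
          coherent_information_clipped(s_b, s_ab),
          coherent_information(s_b, s_ab))
```

Only the first state distinguishes the two definitions: the clipped version
returns 0, while the proposed one returns -1, the amount that would have to
be invested.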

We imagine that Alice and Bob wish to send independent quantum states to a
single decoder Charlie via a noisy channel which acts on both inputs. This
problem is considered by Yard et al. [20]. Our approach using state merging
provides a solution also when either of the channel capacities is negative,
and gives the following better achievable rates:

$$\begin{aligned}
R_A &\leq I(A\rangle CB), \\
R_B &\leq I(B\rangle CA), \\
R_A + R_B &\leq I(AB\rangle C),
\end{aligned} \tag{2}$$

where $R_A$ and $R_B$ are the rates of Alice and Bob for sending quantum
states. Here, we use our redefinition of the coherent information, in that
we allow it to be negative. In achieving these rates, one party can send (or
invest) $I(A\rangle C)$ quantum bits to merge her state with the decoder.
The second party then already has Alice's state at the decoder, and can send
at the higher rate $I(B\rangle AC)$. This provides an interpretation of
negative channel capacities: if the channel of one party has negative
coherent information, this means that she has to invest this amount of
entanglement to help her partner achieve the highest rate. The protocol is
for one of the parties to merge their state with the state held by the
decoder. The expressions in (2) are in formal analogy with the classical
multiple access channel.
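
As a consistency check on the rates just described, the two individual
rates add up to the sum-rate bound in (2). The derivation below is a sketch
that assumes only the redefinition $I(X\rangle Y) = S(Y) - S(XY)$, with all
entropies evaluated on the same joint state of A, B and C; it checks the
corner point and is not a proof of achievability.

```latex
\begin{align*}
I(A\rangle C) + I(B\rangle AC)
  &= \bigl[S(C) - S(AC)\bigr] + \bigl[S(AC) - S(ABC)\bigr] \\
  &= S(C) - S(ABC) \\
  &= I(AB\rangle C).
\end{align*}
```

If $I(A\rangle C)$ is negative, the first bracket is an investment, and the
second party's rate $I(B\rangle AC)$ exceeds the sum rate by exactly that
amount.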

Entanglement of assistance (localizable entanglement): consider Alice, Bob
and $m - 2$ other parties sharing (many copies of) a pure quantum state. The
entanglement of assistance $E_A$ [21] is defined as the maximum entanglement
that the other parties can create between Alice and Bob by local
measurements and classical communication. For many parties, this is often
referred to as localizable entanglement [22], although here we work in the
regime of many copies of the shared state. This problem was recently solved
for up to four parties [23], and can be generalized to an arbitrary number m
of parties using state merging (using universal codes depending only on the
density matrix of the helper, as described in Figure 1). We find that the
maximal amount of entanglement that can be distilled between Alice and Bob,
with the help of the other parties, is given by the minimum entanglement
across any bipartite cut of the system which separates Alice from Bob:

$$E_A = \min_T \{S(AT), S(B\bar{T})\} \tag{3}$$

where the minimum is taken over all possible partitions of the other parties
into a group T and its complement
$\bar{T} = \{1, \ldots, m-2\} \setminus T$. To achieve this, each party in
turn merges their state with the remaining parties, preserving the minimum
cut entanglement.
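
The minimization in (3) is a finite search over subsets of the helpers, which
the following Python sketch enumerates. It is an illustration under stated
assumptions: the entropy of any group of parties is supplied by a
caller-provided function (here a hypothetical oracle for a three-qubit GHZ
state, whose proper non-empty subsets each have entropy 1 bit), and the
names `min_cut_entanglement` and `ghz_entropy` are ours.

```python
from itertools import combinations


def min_cut_entanglement(helpers, entropy_of):
    """Minimum over subsets T of the helpers of min{S(A T), S(B T-bar)}."""
    helpers = frozenset(helpers)
    best = float("inf")
    for size in range(len(helpers) + 1):
        for subset in combinations(sorted(helpers), size):
            t = frozenset(subset)
            t_bar = helpers - t
            cut = min(entropy_of(frozenset({"A"}) | t),
                      entropy_of(frozenset({"B"}) | t_bar))
            best = min(best, cut)
    return best


def ghz_entropy(parties):
    """Entropy in bits of a subset of a three-qubit GHZ state (assumed oracle)."""
    return 1.0 if 0 < len(parties) < 3 else 0.0


# Alice, Bob and a single helper C sharing a GHZ state.
print(min_cut_entanglement(["C"], ghz_entropy))  # 1.0
```

For a pure state the two entries of each cut coincide, since AT and its
complement have equal entropy, so the search is effectively over S(AT)
alone.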

We have described a fundamental quantum information primitive, state
merging, and demonstrated some of its many applications. There are also
conceptual implications. For example, the celebrated strong subadditivity of
quantum entropy [24], $S(A|BC) \leq S(A|B)$, receives a clear interpretation
and transparent proof: having more prior information makes state merging
cheaper. Our results also shed new light on the foundations of quantum
mechanics: it has long been known that there are no conditional
probabilities, so defining conditional entropy is problematic. Just
replacing classical entropy with quantum entropy gives a quantity which can
be positive or negative. Quite paradoxically, only the negative part was
understood operationally, as quantum channel capacity, which, if anything,
made the problem even more obscure. State merging "annihilates" these
problems with each other. It turns out that the puzzling form of quantum
capacity as a conditional entropy is just the flip-side of our
interpretation of quantum conditional entropy as partial quantum
information, which makes equal sense in the positive and negative regime.
The key point is to realize that in the negative regime, one can gain
entanglement and transfer Alice's partial state, while in the positive
regime, only the partial state is transferred.
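
For readers more used to the entropic form of strong subadditivity, the
conditional statement quoted above unpacks as follows. This is a direct
rewriting using the definition of conditional entropy in equation (1), with
no further assumptions.

```latex
\begin{align*}
S(A|BC) \leq S(A|B)
  &\iff S(ABC) - S(BC) \leq S(AB) - S(B) \\
  &\iff S(ABC) + S(B) \leq S(AB) + S(BC).
\end{align*}
```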

On a last note, we wish to point out that remarkably and despite the formal
analogy, the classical scenario does not occur as a classical limit of the
quantum scenario - we consider both classical and quantum communication, and
there is no meaning to preserving entanglement in the classical case. We
have only just begun to grasp the full implications of state merging and
negative partial information; a longer technical account [25], with rigorous
proofs and further applications, is in preparation.